WHAT WE CLAIM IS:

1. A system for processing audio signals comprising: 

(a) a splitter for dividing an input audio signal into a
first and one or more secondary signal portions, which in
combination provide a complete representation of the input
signal, wherein the first signal portion contains information
sufficient to reconstruct a representation of the input
signal;

(b) a first encoder for providing encoded data about the
first signal portion, and one or more secondary encoders for
encoding said secondary signal portions, wherein said
secondary encoders receive input from the first signal
portion and are capable of providing encoded data regarding
the first signal portion; and

(c) a data assembler for combining encoded data from
said first encoder and said secondary encoders into an output
data stream.

2. The system of claim 1 further comprising a decoder
for reconstructing the input signal from information in said
first signal portion.

3. The system of claim 1 wherein dividing the input
signal is done in the frequency domain, and the first signal
portion corresponds to the base band of the input signal.

4. The system of claim 1 wherein said signal portions
are encoded at sampling [illegible] different from that of the
input signal.

5. The system of claim 2 further comprising one or
more secondary decoders for decoding information encoded by
said secondary encoders.

6. The system of claim 1 wherein said first encoder
and said secondary encoders are embedded encoders.

7. The system of claim 1 wherein said splitter is a
filter bank.

8. The system of claim 1 wherein said splitter is a
Fast Fourier Transform (FFT) computing device.

9. The system of claim 8 wherein said splitter divides
the input signal into M octave bands.

10. The system of claim 9 further comprising M1
decoders, 1 ≤ M1 ≤ M, for providing an output signal that
reconstructs the input signal from information in M1 signal
portions of the input signal.

11. The system of claim 10 wherein the output signal
has sampling frequency that is 2^[exponent illegible] times lower than the
sampling frequency of the input signal.
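
As an illustration of the octave-band split described in claims 8 to 11, the following is a minimal sketch and not the claimed apparatus. It assumes a dyadic layout in which the base band spans 0 to (fs/2)/2^(M-1) and each further band spans one octave; that layout, the function names, and the use of a plain DFT in place of an FFT are assumptions of this sketch. Under this assumed layout the lowest M1 bands cover a bandwidth 2^(M-M1) times narrower than the full band.

```python
import cmath

def octave_band_edges(fs, M):
    """(low, high) edges in Hz for M dyadic bands, base band first (assumed layout)."""
    nyq = fs / 2.0
    edges = [(0.0, nyq / 2 ** (M - 1))]
    for m in range(2, M + 1):
        edges.append((nyq / 2 ** (M - m + 1), nyq / 2 ** (M - m)))
    return edges

def dft(x):
    """Naive O(N^2) discrete Fourier transform, standing in for an FFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft_real(X):
    """Inverse DFT, returning the real part of each sample."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

def split_octave(x, fs, M):
    """Split x into M band signals by masking DFT bins; the bands sum back to x."""
    N = len(x)
    X = dft(x)
    bands = []
    for i, (lo, hi) in enumerate(octave_band_edges(fs, M)):
        Xb = [0j] * N
        for k in range(N):
            f = min(k, N - k) * fs / N          # frequency magnitude of bin k
            if lo <= f < hi or (i == M - 1 and f == hi):
                Xb[k] = X[k]
        bands.append(idft_real(Xb))
    return bands
```

Because every DFT bin is assigned to exactly one band, summing all M band signals returns the input, while summing only the lowest M1 bands gives a band-limited version of it.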

12. The system of claim 1 wherein said output data
stream comprises data packets suitable for transmission over
a packet-switched network.

13. The system of claim 12 wherein said data packets
are prioritized in accordance with the signal portion they
represent.
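
One way to picture the prioritization of claims 12 and 13 is sketched here. The `Packet` type, the convention that a smaller number means higher priority, and the byte-budget dropping rule are all hypothetical choices made for illustration; the claims do not specify them.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    portion: int    # 0 = first signal portion, 1.. = secondary portions
    priority: int   # smaller value = more important (assumed convention)
    payload: bytes

def packetize(encoded_portions):
    """Wrap each encoded signal portion in a packet whose priority follows its index."""
    return [Packet(i, i, data) for i, data in enumerate(encoded_portions)]

def drop_to_budget(packets, max_bytes):
    """Keep the highest-priority packets that fit within max_bytes."""
    kept, used = [], 0
    for p in sorted(packets, key=lambda p: p.priority):
        if used + len(p.payload) > max_bytes:
            break
        kept.append(p)
        used += len(p.payload)
    return kept
```

With this rule the packet carrying the first signal portion is the last to be discarded, so a receiver can still reconstruct a representation of the input when secondary packets are lost.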

14. The system of claim 12 wherein said data packets
are assembled as to represent said two or more signal
portions of the input signal.

15. A method for processing audio signals comprising:

(a) dividing an input audio signal into a first and one
or more secondary signal portions, which in combination
provide a complete representation of the input signal,
wherein a first signal portion contains information
sufficient to reconstruct a representation of the input
signal;

(b) providing first encoded data about the first signal
portion, and secondary encoded data about at least one
secondary signal portion, wherein said secondary encoded data
further comprises information about the first signal portion;
and

(c) combining said first encoded data and said secondary
encoded data into an output data stream.

16. The method of claim 15 further comprising the step
of decoding the output data stream to reconstruct the input
signal.

17. The method of claim 15 wherein said signal portions
are encoded at sampling [illegible] different from that of the
input signal.

18. The method of claim 15 wherein said dividing is
performed as a Fast Fourier Transform (FFT) computation.

19. The method of claim 18 further comprising the step
of decoding the output data stream using M1 decoders, 1 ≤ M1
≤ M, for providing an output signal that reconstructs the
input signal from information in M1 signal portions of the
input signal.

20. The method of claim 19 wherein the output signal
has sampling frequency that is 2^[exponent illegible] times lower than the
sampling frequency of the input signal.



21. A system for embedded coding of audio signals 
comprising:

(a) a frame extractor for dividing an input signal into
a plurality of signal frames corresponding to successive time
intervals;

(b) means for providing parametric representations of
the signal in each frame, said parametric representations
being based on a signal model;

(c) means for providing a first encoded data portion 
corresponding to a user-specified parametric representation, 
which first encoded data portion contains information 
sufficient to reconstruct a representation of the input 
signal;

(d) means for providing one or more secondary encoded 
data portions of the user-selected parametric representation; 
and 

(e) means for providing an embedded output signal based
at least on said first encoded data portion and said one or
more secondary encoded data portions of the user-selected
parametric representation.
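
The relation between a first encoded data portion and secondary portions in an embedded output can be illustrated with a toy successive-approximation quantizer. This is only a sketch under stated assumptions: a single parameter normalized to [0, 1), hypothetical function names, and a bisection scheme chosen for simplicity. It is not the coding scheme of claim 21.

```python
def encode_embedded(value, n_bits):
    """Successive-approximation bits of a value in [0, 1), most significant first."""
    bits, lo, hi = [], 0.0, 1.0
    for _ in range(n_bits):
        mid = (lo + hi) / 2
        if value >= mid:
            bits.append(1)
            lo = mid
        else:
            bits.append(0)
            hi = mid
    return bits

def decode_prefix(bits):
    """Decode any prefix of the bit list; more bits narrow the interval further."""
    lo, hi = 0.0, 1.0
    for b in bits:
        mid = (lo + hi) / 2
        if b:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Here the leading bits play the role of the first encoded data portion, since they alone suffice for a coarse reconstruction, and the trailing bits act as secondary portions that refine it.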

22. The system of claim 21 further comprising: 

(f) means for providing representations of the signal in
each frame, which are not based on a signal model.

23. The system of claim 22 further comprising:

(g) means for selecting a specific one from the
representations in (b) and (f) based on user-selected
constraints.

24. The system of claim 21 wherein said means for
providing parametric representations of the signal in each
frame comprises a pitch detector for computing a first
estimate of the pitch of a signal in each frame; means for
determining parameters of sinusoids representing the signal
in each frame; and a spectrum envelope encoder for encoding
the shape of the envelope of the signal in each frame.

25. The system of claim 21 wherein said means for
providing an embedded output signal comprises a bit stream
assembler for providing an output bit stream containing
user-specified information about parameters of at least one
sinusoid in the spectrum of the input signal, and about
parameters representing a spectrum envelope of the signal in
each frame.

26. The system of claim 21 further comprising means for
decoding the embedded output signal.

27. The system of claim 26 wherein said means for
decoding operate at a sampling frequency different from a
sampling frequency of the input signal.

28. The system of claim 21 wherein said means for 
providing an embedded output signal comprises means for 
assembling data packets suitable for transmission over a
packet-switched network.






29. A method for multistage vector quantization of 
signals comprising:

(a) passing an input signal through a first stage of a
multistage vector quantizer having a predetermined set of
codebook vectors, each vector corresponding to a Voronoi
cell, to obtain error vectors corresponding to differences
between a codebook vector and an input signal vector falling
within a Voronoi cell;

(b) determining probability density functions (pdfs) for
the error vectors in at least two Voronoi cells;

(c) transforming error vectors using a transformation
based on the pdfs determined for said at least two Voronoi
cells; and

(d) passing transformed error vectors through at least a
second stage of the multistage vector quantizer to provide a
quantized output signal.

30. The method of claim 29 further comprising the step
of performing an inverse transformation on the quantized
output signal to reconstruct a representation of the input
signal.

31. The method of claim 29 wherein in step (c) the
transformation comprises scaling the sizes of said at least
two Voronoi cells as to approximately equalize these sizes.

32. The method of claim 31 wherein scaling factor for a
Voronoi cell is determined as the inverse of an average for
the Euclidean distance between the codebook vector for the
Voronoi cell and a set of training vectors.
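
The two-stage flow of claims 29 to 32 can be sketched as follows. This is an illustrative sketch, not the claimed method: the function names are hypothetical, the training vectors are assumed to be assigned to cells by nearest codeword, and the per-cell scale of claim 32 stands in for the more general pdf-based transformation of step (c).

```python
import math

def nearest(codebook, v):
    """Index of the codebook vector closest to v in Euclidean distance."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], v))

def cell_scales(codebook, training):
    """Per-cell scale: inverse of the mean distance from codeword to its training vectors."""
    sums = [0.0] * len(codebook)
    counts = [0] * len(codebook)
    for v in training:
        i = nearest(codebook, v)
        sums[i] += math.dist(codebook[i], v)
        counts[i] += 1
    return [counts[i] / sums[i] if sums[i] > 0 else 1.0
            for i in range(len(codebook))]

def encode(v, stage1, stage2, scales):
    """Stage 1 picks a cell; the scaled error vector is quantized by stage 2."""
    i = nearest(stage1, v)
    error = [a - b for a, b in zip(v, stage1[i])]
    scaled = [scales[i] * e for e in error]
    return i, nearest(stage2, scaled)

def decode(i, j, stage1, stage2, scales):
    """Undo the cell-dependent scaling and add the stage-1 codeword back."""
    return [c + e / scales[i] for c, e in zip(stage1[i], stage2[j])]
```

Scaling by the inverse of the average distance makes errors from small and large cells comparable in magnitude, so one second-stage codebook can serve all cells; the decoder applies the inverse scaling, in the spirit of claim 30.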





33. The method of claim 29 wherein in step (c) the
transformation comprises rotating the error vector at an
angle, which is determined by the Voronoi cell.

34. The method of claim 33 wherein the rotation angle
is determined as the angle between the codebook vector for
the Voronoi cell and one of the coordinate axes of the cell.
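
For a two-dimensional case, the rotation of claims 33 and 34 might look like the sketch below. The restriction to two dimensions, the choice of the first coordinate axis, the sign of the rotation, and the function names are assumptions made for illustration only.

```python
import math

def rotation_angle(codeword):
    """Angle between a 2-D codebook vector and the first coordinate axis."""
    return math.atan2(codeword[1], codeword[0])

def rotate(v, angle):
    """Rotate a 2-D vector counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]

def transform_error(error, codeword):
    """Rotate the error vector so the codeword direction lines up with the first axis."""
    return rotate(error, -rotation_angle(codeword))

def inverse_transform(rotated, codeword):
    """Undo transform_error for the same codeword."""
    return rotate(rotated, rotation_angle(codeword))
```

Because the angle depends only on the codeword, a decoder that knows the first-stage index can recompute it and invert the rotation without side information.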

35. The method of claim 29 wherein in step (c) the
transformation comprises both scaling and rotating the error
vector at given angle.

36. The method of claim 29 wherein in step (c) a
transformation for inner Voronoi cells is different from a
transformation for outer Voronoi cells.

37. The method of claim 29 wherein in step (c) the
transformation is performed using tuning of translation and
rotation parameters as to maximally align boundaries of
scaled Voronoi regions and slopes of pdfs in each Voronoi
region.

38. A system for processing audio signals comprising:

(a) a frame extractor for dividing an input audio signal
into a plurality of signal frames corresponding to successive
time intervals;

(b) a frame mode classifier for determining if the 
signal in a frame is in a transition state; 

(c) a processor for extracting parameters of the signal
in a frame receiving input from said classifier, wherein for
frames the signal of which is determined to be in said
transition state said extracted parameters include phase
information; and

(d) a multi-mode coder in which extracted parameters of
the signal in a frame are processed in at least two distinct
paths dependent on whether the frame signal is determined to
be in a transition state.

39. The system of claim 38 wherein said extracted
parameters comprise gain, pitch and voicing parameters and
parameters related to Linear Prediction Coefficients (LPCs).

40. The system of claim 38 wherein said frame
mode classifier receives input from said processor for
extracting parameters and outputs at least one state flag.

41. The system of claim 40 wherein the multi-mode coder
determines one of said at least two distinct processing paths
on the basis of said at least one state flag.

42. The system of claim 38 further comprising a decoder
for decoding signals in at least two distinct processing
paths.

43. The system of claim 38 wherein said distinct
processing paths include distinct bit allocation for frames
determined to be in different states.
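
The frame classification and two-path coding of claims 38 to 43 can be pictured with the sketch below. The energy-ratio test for a transition, the threshold, and the bit budgets are hypothetical values chosen for illustration; the claims do not state how a transition is detected or how many bits each parameter receives.

```python
def extract_frames(signal, frame_len):
    """Cut the signal into consecutive, non-overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def classify_frames(frames, ratio=4.0):
    """Flag a frame as a transition when its energy changes by more than `ratio`."""
    states, prev = [], None
    for frame in frames:
        e = sum(s * s for s in frame)
        jump = prev is not None and (e > ratio * prev or prev > ratio * e)
        states.append("transition" if jump else "steady")
        prev = e
    return states

# Hypothetical per-state bit allocation; transition frames also carry phase.
BITS = {
    "steady": {"gain": 5, "pitch": 7, "voicing": 3, "lpc": 25},
    "transition": {"gain": 5, "pitch": 7, "voicing": 3, "lpc": 25, "phase": 20},
}

def code_frame(state, params):
    """Route parameters down the path for `state`, pairing each with its bit budget."""
    return {name: (params[name], bits)
            for name, bits in BITS[state].items() if name in params}
```

The state string plays the role of the state flag of claim 40: it alone selects the processing path, and only the transition path keeps the phase parameter.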

44. A system for processing audio signals comprising: 

(a) a frame extractor for dividing an input signal into
a plurality of signal frames corresponding to successive time
intervals;

(b) means for providing a parametric representation of
the signal in each frame, said parametric representation
being based on a signal model;

(c) a non-linear processor for providing refined
estimates of parameters of the parametric representation of
the signal in each frame; and

(d) means for encoding said refined parameter estimates.

45. The system of claim 44 wherein said refined
estimates comprises an estimate of the pitch.

46. The system of claim 44 wherein said refined
estimates comprises an estimate of a voicing parameter for
the input speech signal.

47. The system of claim 44 wherein said refined
estimates comprises an estimate of a pitch onset time for an
input speech signal.

48. The system of claim 44 wherein said non-linear
processor computes the maximum of a correlation function of
the input signal over a set of complex frequencies.

49. The system of claim 45 wherein the computation is
done iteratively.

50. The system of claim 44 wherein a measure of voicing
for the input signal is computed as

$$\rho(\omega_0) = \frac{\sum_{n} |Y_n|^2 \cdot 0.5\,[1 + \cos(2\pi\omega_n/\omega_0)]}{\sum_{n} |Y_n|^2}$$

where $Y_n$ are complex amplitudes of the output of a nonlinear
operation defined over the input signal s(n) as defined

$$\text{[illegible]} = \mu \sum_{k=1}^{K} \gamma_k \exp(j n \omega_k) + \sum_{l=1}^{L} \sum_{k=1}^{K-l} \gamma_{k+l}\,\gamma_k^{*} \exp[\,j n (\omega_{k+l} - \omega_k)\,] \qquad (1)$$

where $\gamma_k = A_k \exp(j\theta_k)$ is the complex amplitude and where
$0 \le \mu \le 1$ is a bias factor.
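
A numerical reading of the voicing measure in claim 50 is sketched below. It assumes the measure is the power-weighted average of 0.5[1 + cos(2πω/ω0)] over the components of the nonlinear output, which is how the partly legible formula appears to read; the function name and argument layout are hypothetical.

```python
import math

def voicing_measure(amplitudes, freqs, w0):
    """Power-weighted average of 0.5*(1 + cos(2*pi*w/w0)) over all components.

    amplitudes: complex amplitudes of the components (assumed given)
    freqs:      their frequencies, in the same units as w0
    w0:         candidate fundamental frequency
    """
    num = sum(abs(a) ** 2 * 0.5 * (1.0 + math.cos(2.0 * math.pi * w / w0))
              for a, w in zip(amplitudes, freqs))
    den = sum(abs(a) ** 2 for a in amplitudes)
    return num / den if den > 0 else 0.0
```

The weight equals 1 when a component lies on a multiple of ω0 and 0 when it lies halfway between two multiples, so under this reading the measure approaches 1 for strongly harmonic frames and falls toward 0 as energy moves away from the harmonics.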


